AD-754  751 


MULTIWAY  CONTINGENCY  TABLE  ANALYSIS 
APPLIED TO THE CLASSIFICATION OF MULTIVARIATE
DICHOTOMOUS POPULATIONS


S. Kullback

George  Washington  University 


Prepared  for: 

Office  of  Naval  Research 

9  January  1973 


DISTRIBUTED  BY: 


U. S. DEPARTMENT OF COMMERCE

5285  Port  Royal  Road,  Springfield  Va.  22151 




MULTIWAY  CONTINGENCY  TABLE  ANALYSIS  APPLIED  TO  THE 
CLASSIFICATION  OF  MULTIVARIATE  DICHOTOMOUS  POPULATIONS 


by 

S.  KULLBACK 


TECHNICAL  REPORT  NO.  4 
January  9,  1973 


PREPARED  UNDER  CONTRACT  N00014-67-A-0214-0015 
(NR-042-267) 

OFFICE  OF  NAVAL  RESEARCH 

Herbert  Solomon,  Project  Director 




Reproduction  in  Whole  or  in  Part  is  Permitted  for 
any  Purpose  of  the  United  States  Government 


Reproduced  by 

NATIONAL  TECHNICAL 
INFORMATION  SERVICE 

U. S. Department of Commerce
Springfield VA 22151


This document has been approved for public release and sale; its
distribution is unlimited.


DEPARTMENT  OF  STATISTICS 
THE  GEORGE  WASHINGTON  UNIVERSITY 
WASHINGTON,  D.  C.  20006 


Unclassified 

Security  Classification 

DOCUMENT CONTROL DATA - R&D

(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)


3. REPORT TITLE

MULTIWAY  CONTINGENCY  TABLE  ANALYSIS  APPLIED  TO  THE  CLASSIFICATION  OF 
MULTIVARIATE  DICHOTOMOUS  POPULATIONS 


4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

TECHNICAL REPORT

5. AUTHOR(S) (Last name, first name, initial)

KULLBACK,  S. 


10. AVAILABILITY/LIMITATION NOTICES

Unlimited. Reproduction in whole or in part is permitted for any purpose
of the United States Government.


13. ABSTRACT

Multiway  contingency  tables,  or  cross-classifications  of  vectors  of 
discrete  random  variables,  provide  a  useful  approach  to  the  analysis  of 
multivariate  discrete  data.  In  the  particular  application  we  shall 
consider  herein,  the  individual  variates  are  dichotomous  or  binary. 

We shall use techniques and concepts presented and discussed by the author
in previous papers. We note that the procedures and analysis are not
restricted to dichotomous or binary data but are also applicable to
polychotomous variates. The procedure we shall use is based on the principle
of minimum discrimination information estimation applied to the analysis of
multiway contingency tables. It yields results practically equivalent to
procedures proposed by other investigators. When the minimum discrimination
information estimates provide a satisfactory fit to a set of data, a complete
analysis, including significance tests and estimates describing the pattern
of observations, is provided.

12. SPONSORING MILITARY ACTIVITY

Office  of  Naval  Research 
Statistics  &  Probability  Program 
Arlington,  Va.  22217 


January  9,  1973 


8a. CONTRACT OR GRANT NO.

N00014-67-A-0214-0015


b. PROJECT NO.

NR-042-267 


NUMBER(S)

14 

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)


1. ORIGINATING ACTIVITY (Corporate author)

THE  GEORGE  WASHINGTON  UNIVERSITY 

DEPT.  OF  STATISTICS  WASHINGTON,  D.  C.  20006 


Security Classification


14. KEY WORDS

CONTINGENCY  TABLES 

DICHOTOMOUS  POPULATIONS 

MULTIVARIATE 


Multiway  Contingency  Table  Analysis  Applied  to  the  Classification 
of  Multivariate  Dichotomous  Populations 


by 

S.  Kullback 

Introduction 

Multiway contingency tables, or cross-classifications of vectors of
discrete random variables, provide a useful approach to the analysis of
multivariate discrete data. In the particular application we shall
consider herein, the individual variates are dichotomous or binary.

We shall use techniques and concepts presented and discussed in [4]
and [6]. We note that the procedures and analysis are not restricted
to dichotomous or binary data but are also applicable to polychotomous
variates.

For background on the study and problem which gave rise to the data
we shall analyze see [8]. In [3], procedures further developed in [4]
and [6] were applied to problems of multivariate binary data in
information systems, such as communication, pattern recognition, and
learning systems. In [1] there is a review of methods and models for the
analysis of multivariate binary data. Solomon's data, which we shall
analyze herein, is given as a typical example. In [7] there is developed a
model based on a set of orthogonal polynomials and applied to Solomon's
data. We remark that the procedure we shall use, based on the principle
of minimum discrimination information estimation applied to the analysis
of multiway contingency tables, yields a result practically equivalent to
that in [7].




"Multivariate data analysis needs a large and flexible class of
hypothetical distributions of free variables indexed by the values of
fixed variables. From this class, appropriate subfamilies would be
chosen for fitting to specific data sets" [2]. The principle of minimum
discrimination information estimation, and its basis, the minimum
discrimination information theorem which is quite general in its
formulation, lead to exponential families of distributions [4], [5],
[6]. The exponential families have very useful and desirable statistical
properties and contain many subfamilies in common use [2]. "The
data analytic attitude to models is empirical rather than theoretical...
when detailed theoretical understanding is unavailable, a more empirical
attitude is natural, so that estimation of parameters in models should
be seen less as attempts to discover underlying truth and more as data
calibrating devices which make it easier to conceive of noisy data in
terms of smooth distributions and relations. Exponential families are
viewed here as intended for use in the empirical mode. With a given
data set, a variety of models may be tried on, and one selected on the
ground of looks and fit" [2]. When the minimum discrimination information
estimates provide a satisfactory fit to a set of data, a complete
analysis, including significance tests and estimates describing the
pattern of observations, is provided.

Solomon's  Data 

A total of 2982 high-school seniors were given an attitude questionnaire
to assess their attitude towards science. The students were also
classified on the basis of an IQ test into high IQ, the upper half, and
low IQ, the lower half. The sixteen possible response vectors to the
four agree-disagree statements were tabulated. The data are given in
table 1, where $x_1$, $x_2$, $x_3$, $x_4$ indicate the statements [8, p. 416],
agree and disagree were coded as 1 and 0 respectively, and the counts are
listed for low IQ and high IQ. The problem of interest was to determine
whether the response vectors could be used as a basis for classifying the
students into one of two classes, and to evaluate possible classification
procedures.

Contingency  Table  Analysis 

We shall treat the data as a five-way 2×2×2×2×2 contingency table,
denoting the original observations by $x(hijk\ell)$, where

h=1, low IQ, h=2, high IQ;

i=1, response to $x_1$ coded 0, i=2, response to $x_1$ coded 1;

j=1, response to $x_2$ coded 0, j=2, response to $x_2$ coded 1;

k=1, response to $x_3$ coded 0, k=2, response to $x_3$ coded 1;

$\ell$=1, response to $x_4$ coded 0, $\ell$=2, response to $x_4$ coded 1.
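The index convention above can be made concrete with a short sketch. It assumes the table is held as a Python dict keyed by (h, i, j, k, l) with every index in {1, 2}; the counts are hypothetical placeholders, not Solomon's data, and the helper names `index_from_code` and `marginal` are illustrations only.

```python
from itertools import product

# All 32 cells of the 2x2x2x2x2 table, indexed (h, i, j, k, l) with values 1 or 2.
CELLS = list(product((1, 2), repeat=5))

def index_from_code(code):
    """Map a response coded 0/1 (disagree/agree) to the table index 1/2."""
    return code + 1

def marginal(x, keep):
    """Sum counts over every axis not listed in `keep`.

    `keep` holds axis positions (0=h, 1=i, 2=j, 3=k, 4=l), so keep=(0,)
    gives x(h....) and keep=(1, 2, 3, 4) gives x(.ijkl).
    """
    out = {}
    for cell, count in x.items():
        key = tuple(cell[a] for a in keep)
        out[key] = out.get(key, 0) + count
    return out

# Hypothetical counts (NOT Solomon's data): 1 + number of indices equal to 2.
x = {cell: 1 + sum(v == 2 for v in cell) for cell in CELLS}

n = sum(x.values())
x_h = marginal(x, (0,))             # x(h....), one total per IQ group
x_ijkl = marginal(x, (1, 2, 3, 4))  # x(.ijkl), one total per response vector
```

Holding the table as a dict keeps the dotted marginal notation explicit: each dot in x(h....) corresponds to an axis left out of `keep`.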

As a first overview of the data to determine the marginals and their
related interaction parameters which may be considered to furnish
significant values in the log-linear representation of the exponential family
of the estimates [6], we list in table 2a, Analysis of Information, a
sequential study of interaction and effect type measures [4], [6].

We remark that the first estimate is

$$x^*_a(hijk\ell) = x(h\cdot\cdot\cdot\cdot)\,x(\cdot ijk\ell)/n$$

and the minimum discrimination information statistic (interaction type
measure)

$$2I(x:x^*_a) = 2\sum_h\sum_i\sum_j\sum_k\sum_\ell x(hijk\ell)\ln\frac{n\,x(hijk\ell)}{x(h\cdot\cdot\cdot\cdot)\,x(\cdot ijk\ell)}$$

tests a null hypothesis that the IQ groupings are homogeneous over the
sixteen response vectors [5, Chap. 8], [4]. This null hypothesis is
rejected, and the subsequent study of effect and interaction type measures
is an attempt to get a good fit to the data and account for the variation.
Although the association between IQ and the response to the first
statement is not significant, $2I(x^*_b:x^*_a) = 2.376$, 1 D.F., it was decided to
examine in detail the estimate $x^*_e(hijk\ell)$, whose numerical values are
given in table 1. (We remark that it may be shown that

$$2I(x^*_b:x^*_a) = 2\sum_h\sum_i x(hi\cdot\cdot\cdot)\ln\frac{n\,x(hi\cdot\cdot\cdot)}{x(h\cdot\cdot\cdot\cdot)\,x(\cdot i\cdot\cdot\cdot)}$$

and tests a null hypothesis that IQ is homogeneous over the response to
the first question.) The estimate $x^*_e(hijk\ell)$ was selected because it
does not differ significantly from the observed values, $2I(x:x^*_e) = 16.307$,
11 D.F. (represents an acceptable fit), is symmetric with respect
to the four statements, and is comparable to the first-order model
estimate of [7], whose values are also listed in table 1.
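The homogeneity estimate and its statistic can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the five-way table is collapsed to a two-way layout (IQ group by response vector), the function name `homogeneity_fit` is hypothetical, and the counts are invented for the example rather than taken from table 1.

```python
import math

def homogeneity_fit(rows):
    """Return (estimate, statistic, df) for a two-way table `rows`.

    rows[h][c] is the count for population h and response vector c.
    The estimate is x*(h, c) = x(h.) * x(.c) / n, and the statistic is
    2I(x : x*) = 2 * sum x * ln(x / x*), with (R - 1)(C - 1) degrees of freedom.
    """
    n = sum(sum(r) for r in rows)
    row_tot = [sum(r) for r in rows]
    col_tot = [sum(col) for col in zip(*rows)]
    est = [[rt * ct / n for ct in col_tot] for rt in row_tot]
    stat = 0.0
    for r, e in zip(rows, est):
        for x, xs in zip(r, e):
            if x > 0:  # a zero cell contributes 0 to the sum
                stat += x * math.log(x / xs)
    df = (len(rows) - 1) * (len(col_tot) - 1)
    return est, 2.0 * stat, df

# Hypothetical counts, NOT Solomon's data: two populations, four response patterns.
toy = [[30, 20, 25, 25],
       [20, 30, 25, 25]]
est, stat, df = homogeneity_fit(toy)
```

With two IQ groups and sixteen response vectors the same function would report (2 - 1)(16 - 1) = 15 degrees of freedom, which agrees with row a) of table 2a.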

From the log-linear representation in figure 1 [6], we obtain the
parametric representation for the log-odds (low IQ/high IQ)

$$\ln\frac{x^*_e(1ijk\ell)}{x^*_e(2ijk\ell)}$$

over the sixteen response vectors as given in table 3a. Thus, for example

$$\ln\frac{x^*_e(11111)}{x^*_e(21111)} = \tau^{h}_{1}+\tau^{hi}_{11}+\tau^{hj}_{11}+\tau^{hk}_{11}+\tau^{h\ell}_{11},$$

that is, a linear regression of the log-odds in terms of a constant $\tau^{h}_{1}$
and the main effects of each component of the response vector, namely,
$\tau^{hi}_{11}$, $\tau^{hj}_{11}$, $\tau^{hk}_{11}$, $\tau^{h\ell}_{11}$. The numerical values of the log-odds and the
parameters are easily obtained from the entries in the computer output and
are also given in table 3a [6].
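As a check on the parametric representation, the sixteen log-odds can be rebuilt from the five parameter values listed with table 3a. This is a minimal sketch: the dictionary layout and the function name `log_odds` are assumptions made for illustration, and a main effect is added whenever the corresponding index equals 1, as in the example for the vector 1111.

```python
from itertools import product

# Parameter values as listed with table 3a for the estimate x*_e.
TAU_H = -0.3831
TAU = {"i": -0.2030, "j": 0.1240, "k": 0.3411, "l": 0.3338}

def log_odds(i, j, k, l):
    """ln[x*_e(1ijkl) / x*_e(2ijkl)]: constant plus one main effect per index equal to 1."""
    total = TAU_H
    for name, value in zip("ijkl", (i, j, k, l)):
        if value == 1:
            total += TAU[name]
    return total

# One entry per response vector, rounded to the four decimals used in table 3a.
table = {v: round(log_odds(*v), 4) for v in product((1, 2), repeat=4)}
```

Because the parameters are rounded to four decimals, a rebuilt entry can differ from the tabulated one by a unit in the last place.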

We note from table 3a that

$$\ln\frac{x^*_e(1ijk1)}{x^*_e(2ijk1)} - \ln\frac{x^*_e(1ijk2)}{x^*_e(2ijk2)} = \tau^{h\ell}_{11} = 0.3338,$$

that is, a change in the response to the fourth statement from agree
(coded 1, $\ell$=2) to disagree (coded 0, $\ell$=1) is associated with an
increase of 0.3338 in the log-odds (low IQ/high IQ). Note also that
$\tau^{h\ell}_{11}$ represents the association between IQ and response to the
fourth statement as measured by the log-cross-product ratio

$$\tau^{h\ell}_{11} = \ln\frac{x^*_e(1ijk1)\,x^*_e(2ijk2)}{x^*_e(2ijk1)\,x^*_e(1ijk2)}$$

and is the same for all eight levels of the responses to statements one,
two and three.

Similarly, it is found that

$$\ln\frac{x^*_e(1ij1\ell)}{x^*_e(2ij1\ell)} - \ln\frac{x^*_e(1ij2\ell)}{x^*_e(2ij2\ell)} = \tau^{hk}_{11} = 0.3411,$$

$$\ln\frac{x^*_e(1i1k\ell)}{x^*_e(2i1k\ell)} - \ln\frac{x^*_e(1i2k\ell)}{x^*_e(2i2k\ell)} = \tau^{hj}_{11} = 0.1240,$$

$$\ln\frac{x^*_e(11jk\ell)}{x^*_e(21jk\ell)} - \ln\frac{x^*_e(12jk\ell)}{x^*_e(22jk\ell)} = \tau^{hi}_{11} = -0.2030.$$
Since $x(1\cdot\cdot\cdot\cdot) = x^*_e(1\cdot\cdot\cdot\cdot) = 1491$, and $x(2\cdot\cdot\cdot\cdot) = x^*_e(2\cdot\cdot\cdot\cdot) = 1491$,
we assign a response vector $(ijk\ell)$ to the region

$E_1$: classify as population h=1 (low IQ), when

$$\ln\frac{x^*_e(1ijk\ell)}{x^*_e(2ijk\ell)} \ge 0$$

and to the complementary region

$E_2$: classify as population h=2 (high IQ), when

$$\ln\frac{x^*_e(1ijk\ell)}{x^*_e(2ijk\ell)} < 0.$$

If we set

$$\mu_1(E_2) = \sum_{(ijk\ell)\in E_2}\frac{x^*_e(1ijk\ell)}{1491}, \qquad \mu_2(E_1) = \sum_{(ijk\ell)\in E_1}\frac{x^*_e(2ijk\ell)}{1491},$$

then the probability of error of the classification procedure is
[5, pp. 4, 69, 80]

$$\text{Prob. Error} = p\,\mu_2(E_1) + q\,\mu_1(E_2) = \bigl(\mu_2(E_1)+\mu_1(E_2)\bigr)/2,$$

since here $p = x(2\cdot\cdot\cdot\cdot)/2982 = \tfrac12$, $q = x(1\cdot\cdot\cdot\cdot)/2982 = \tfrac12$.

The relevant computations with $x^*_e(hijk\ell)$ are given in table 4(b)
and show that the Prob. Error = 0.444. The corresponding computations
with the original data $x(hijk\ell)$ are given in table 4(a) and yield
Prob. Error = 0.441.
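The classification rule and the error formula can be sketched as follows. The regions are derived from the parameter values listed with table 3a; the numerical illustration of the error formula uses the two totals printed in table 4(e), which belong to the Martin and Bradley estimates rather than to the estimate discussed here. The function names `classify` and `prob_error` are assumptions made for this sketch.

```python
from itertools import product

# Parameter values as listed with table 3a for the estimate x*_e.
TAU_H = -0.3831
TAU = {"i": -0.2030, "j": 0.1240, "k": 0.3411, "l": 0.3338}

def log_odds(vec):
    """Log-odds (low IQ / high IQ) for a response vector of indices in {1, 2}."""
    return TAU_H + sum(TAU[name] for name, v in zip("ijkl", vec) if v == 1)

def classify(vec):
    """Return 1 (low IQ, region E1) when the log-odds are >= 0, else 2 (high IQ, region E2)."""
    return 1 if log_odds(vec) >= 0 else 2

def prob_error(mu2_e1, mu1_e2, p=0.5, q=0.5):
    """p * mu2(E1) + q * mu1(E2), with p and q the weights of populations 2 and 1."""
    return p * mu2_e1 + q * mu1_e2

# Response vectors assigned to the low-IQ region E1.
E1 = sorted(v for v in product((1, 2), repeat=4) if classify(v) == 1)

# Totals printed in table 4(e): 726.45 is the high-IQ mass falling in E1 and
# 1491 - 891.55 is the low-IQ mass falling in E2, each out of 1491.
err = prob_error(726.45 / 1491, (1491 - 891.55) / 1491)
```

Under these parameter values six of the sixteen response vectors fall in E1, the same six vectors that table 4(e) lists.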


Other  Estimates 

In view of the measure of the effect of the marginal $x(hi\cdot\cdot\ell)$ (and
the associated interaction parameters) in table 2a, $2I(x^*_m:x^*_g) = 4.316$, 1 D.F.,
and the marginal $x(h\cdot j\cdot\ell)$, $2I(x^*_p:x^*_n) = 3.181$, 1 D.F., the estimate
$x^*_v(hijk\ell)$ fitting the marginals $x(\cdot ijk\ell)$, $x(h\cdot j\cdot\cdot)$, $x(h\cdot\cdot k\cdot)$, $x(hi\cdot\cdot\ell)$
and the estimate $x^*_w(hijk\ell)$ fitting the marginals $x(\cdot ijk\ell)$, $x(h\cdot\cdot k\cdot)$,
$x(hi\cdot\cdot\ell)$, $x(h\cdot j\cdot\ell)$ were computed. The estimates are given in table 1
and the relevant analysis of information is given in table 2b.

The values of the log-odds, parametric representation, and the
values of the associated interaction parameters are given in table 3b
for $x^*_v(hijk\ell)$ and in table 3c for $x^*_w(hijk\ell)$. Note from table 3b that

$$\ln\frac{x^*_v(11jk1)}{x^*_v(21jk1)} - \ln\frac{x^*_v(11jk2)}{x^*_v(21jk2)} = \tau^{h\ell}_{11}+\tau^{hi\ell}_{111} = 0.6469,$$

$$\ln\frac{x^*_v(12jk1)}{x^*_v(22jk1)} - \ln\frac{x^*_v(12jk2)}{x^*_v(22jk2)} = \tau^{h\ell}_{11} = 0.2680,$$

reflecting the interaction of the responses to the first and fourth
statements.


From table 3c, it is found for example, that

$$\ln\frac{x^*_w(111k1)}{x^*_w(211k1)} - \ln\frac{x^*_w(111k2)}{x^*_w(211k2)} = \tau^{h\ell}_{11}+\tau^{hi\ell}_{111}+\tau^{hj\ell}_{111} = 0.5806$$

$$\ln\frac{x^*_w(121k1)}{x^*_w(221k1)} - \ln\frac{x^*_w(121k2)}{x^*_w(221k2)} = \tau^{h\ell}_{11}+\tau^{hj\ell}_{111} = \ldots$$

$$\ln\frac{x^*_w(112k1)}{x^*_w(212k1)} - \ln\frac{x^*_w(112k2)}{x^*_w(212k2)} = \tau^{h\ell}_{11}+\tau^{hi\ell}_{111} = 0.9571$$

$$\ln\frac{x^*_w(122k1)}{x^*_w(222k1)} - \ln\frac{x^*_w(122k2)}{x^*_w(222k2)} = \tau^{h\ell}_{11} = 0.5595$$

reflecting the interactions of the responses to the first, second and
fourth statements.

The computation of the probability of error using the estimates
$x^*_v(hijk\ell)$ and $x^*_w(hijk\ell)$ is shown in table 4(c) and 4(d) respectively,
and yields probabilities of error 0.444 and 0.446.

Measure  of  Divergence 

As a measure of the divergence between the low IQ and high IQ observed
and estimated values, we computed the values of

$$J(1,2) = \tfrac12\sum_i\sum_j\sum_k\sum_\ell \bigl(x(1ijk\ell)-x(2ijk\ell)\bigr)\ln\frac{x(1ijk\ell)}{x(2ijk\ell)}$$

for $x(hijk\ell)$, $x^*_e(hijk\ell)$, $x^*_v(hijk\ell)$, $x^*_w(hijk\ell)$ [5, p. 150]. The resulting
values and their ratios to the respective degrees of freedom are given
in table 5. As is to be expected from the properties of the discrimination
information we note that

$$J(1,2;x^*_e) < J(1,2;x^*_v) < J(1,2;x^*_w) < J(1,2;x).$$

However the ratio to the respective degrees of freedom leads to the
inequalities

$$J(1,2;x)/\text{D.F.} < J(1,2;x^*_e)/\text{D.F.} < J(1,2;x^*_v)/\text{D.F.} < J(1,2;x^*_w)/\text{D.F.}$$
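The divergence and the ratios to degrees of freedom can be sketched as below. The function `divergence` follows the form with the factor one half that appears in table 5; the two short count vectors are hypothetical and serve only to exercise the function, while the four value and degrees-of-freedom pairs are those printed in table 5.

```python
import math

def divergence(x1, x2):
    """J(1,2) = 0.5 * sum (x1 - x2) * ln(x1 / x2) over matching cells (all counts > 0)."""
    return 0.5 * sum((a - b) * math.log(a / b) for a, b in zip(x1, x2))

# Hypothetical counts, NOT Solomon's data.
low = [30.0, 20.0, 25.0, 25.0]
high = [20.0, 30.0, 25.0, 25.0]
j_toy = divergence(low, high)

# Values and degrees of freedom as printed in table 5.
table5 = {"x": (69.132, 15), "x*_e": (52.374, 11), "x*_v": (56.249, 10), "x*_w": (59.815, 9)}
ratios = {name: round(j / df, 2) for name, (j, df) in table5.items()}
```

Each term of the sum is nonnegative, and the measure is symmetric in the two populations, so swapping the arguments leaves the value unchanged.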

Remark 

Martin and Bradley [7] examined Solomon's data in terms of an estimate
they called a first-order or linear model. These estimated values are
given in table 1. It turns out that although the underlying approaches
are different, the Martin and Bradley parameters and estimates are
practically the same as those for $x^*_e(hijk\ell)$. From [7, pp. 216-217] we
note that

$$\ln\frac{x^*_e(12222)}{x^*_e(22222)} = \tau^{h}_{1} = \ln\frac{1+a_0+a_1+a_2+a_3+a_4}{1-a_0-a_1-a_2-a_3-a_4}$$

$$\ln\frac{x^*_e(12221)}{x^*_e(22221)} = \tau^{h}_{1}+\tau^{h\ell}_{11} = \ln\frac{1+a_0+a_1+a_2+a_3-a_4}{1-a_0-a_1-a_2-a_3+a_4}$$

$$\ln\frac{x^*_e(12212)}{x^*_e(22212)} = \tau^{h}_{1}+\tau^{hk}_{11} = \ln\frac{1+a_0+a_1+a_2-a_3+a_4}{1-a_0-a_1-a_2+a_3-a_4}$$

$$\ln\frac{x^*_e(12122)}{x^*_e(22122)} = \tau^{h}_{1}+\tau^{hj}_{11} = \ln\frac{1+a_0+a_1-a_2+a_3+a_4}{1-a_0-a_1+a_2-a_3-a_4}$$

$$\ln\frac{x^*_e(11222)}{x^*_e(21222)} = \tau^{h}_{1}+\tau^{hi}_{11} = \ln\frac{1+a_0-a_1+a_2+a_3+a_4}{1-a_0+a_1-a_2-a_3-a_4}$$

or to a first approximation

$$\tau^{h}_{1} = 2a_0+2a_1+2a_2+2a_3+2a_4$$

$$\tau^{h}_{1}+\tau^{h\ell}_{11} = 2a_0+2a_1+2a_2+2a_3-2a_4$$

$$\tau^{h}_{1}+\tau^{hk}_{11} = 2a_0+2a_1+2a_2-2a_3+2a_4$$

$$\tau^{h}_{1}+\tau^{hj}_{11} = 2a_0+2a_1-2a_2+2a_3+2a_4$$

$$\tau^{h}_{1}+\tau^{hi}_{11} = 2a_0-2a_1+2a_2+2a_3+2a_4.$$

It is found that

$$\tau^{h\ell}_{11} = -4a_4, \qquad \tau^{hk}_{11} = -4a_3, \qquad \tau^{hj}_{11} = -4a_2, \qquad \tau^{hi}_{11} = -4a_1.$$

The values of the parameters given in [7, table 3, p. 217] are

$$a_0 = -0.042, \quad a_1 = 0.049, \quad a_2 = -0.031, \quad a_3 = -0.084, \quad a_4 = -0.082$$

so that

$$\tau^{h\ell}_{11} = 0.3338 = 0.334, \qquad -4a_4 = 0.328$$

$$\tau^{hk}_{11} = 0.3411 = 0.341, \qquad -4a_3 = 0.336$$

$$\tau^{hj}_{11} = 0.1240 = 0.124, \qquad -4a_2 = 0.124$$

$$\tau^{hi}_{11} = -0.2030 = -0.203, \qquad -4a_1 = -0.196$$

The computation of the probability of error using the estimates in
[7] is shown in table 4(e) and yields a probability of error of 0.445.
(Martin and Bradley give a value of the risk as 0.455.)
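The quality of the first approximation in the Remark can be checked numerically from the five parameter values quoted from [7]. The sketch below assumes the sign convention of the displayed ratios, in which a statement with index 2 contributes +a_m and a statement with index 1 contributes -a_m; the function name `mb_log_odds` is hypothetical.

```python
import math

# Parameter values quoted from [7, table 3, p. 217]: a0, a1, a2, a3, a4.
a = [-0.042, 0.049, -0.031, -0.084, -0.082]

def mb_log_odds(signs):
    """Exact ln[(1 + s) / (1 - s)] with s = a0 + sum(sign_m * a_m), sign_m = +1 or -1."""
    s = a[0] + sum(sg * am for sg, am in zip(signs, a[1:]))
    return math.log((1 + s) / (1 - s))

exact_const = mb_log_odds((1, 1, 1, 1))  # compare with the constant -0.3831 of table 3a
approx_const = 2 * sum(a)                # first approximation 2(a0 + a1 + a2 + a3 + a4)

# Effect of the fourth statement: exact difference versus the approximation -4 * a4.
exact_l = mb_log_odds((1, 1, 1, -1)) - exact_const
approx_l = -4 * a[4]
```

The exact difference for the fourth statement lies between the approximation 0.328 and the fitted value 0.3338, which is in line with the remark that the two sets of parameters are practically the same.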
Table 2a

Analysis of Information

    Marginals Fitted
                                        Information                D.F.

a)  x(.ijkl), x(h....)
                                        2I(x:x*_a)    = 68.369      15

b)  x(.ijkl), x(hi...)
                                        2I(x*_b:x*_a) =  2.376       1
                                        2I(x:x*_b)    = 65.993      14

c)  x(.ijkl), x(hi...), x(h.j..)
                                        2I(x*_c:x*_b) =  4.265       1
                                        2I(x:x*_c)    = 61.728      13

d)  x(.ijkl), x(hi...), x(h.j..), x(h..k.)
                                        2I(x*_d:x*_c) = 25.230       1
                                        2I(x:x*_d)    = 36.498      12

e)  x(.ijkl), x(hi...), x(h.j..), x(h..k.), x(h...l)
                                        2I(x*_e:x*_d) = 20.191       1
                                        2I(x:x*_e)    = 16.307      11

f)  x(.ijkl), x(h..k.), x(h...l), x(hij..)
                                        2I(x*_f:x*_e) =  3.016       1
                                        2I(x:x*_f)    = 13.291      10

g)  x(.ijkl), x(h...l), x(hij..), x(hi.k.)
                                        2I(x*_g:x*_f) =  0.042       1
                                        2I(x:x*_g)    = 13.249       9

m)  x(.ijkl), x(hij..), x(hi.k.), x(hi..l)
                                        2I(x*_m:x*_g) =  4.316       1
                                        2I(x:x*_m)    =  8.933       8

n)  x(.ijkl), x(hij..), x(hi.k.), x(hi..l), x(h.jk.)
                                        2I(x*_n:x*_m) =  0.983       1
                                        2I(x:x*_n)    =  7.950       7

p)  x(.ijkl), x(hij..), x(hi.k.), x(hi..l), x(h.jk.), x(h.j.l)
                                        2I(x*_p:x*_n) =  3.181       1
                                        2I(x:x*_p)    =  4.769       6

q)  x(.ijkl), x(hij..), x(hi.k.), x(hi..l), x(h.jk.), x(h.j.l), x(h..kl)
                                        2I(x*_q:x*_p) =  0.219       1
                                        2I(x:x*_q)    =  4.550       5

r)  x(.ijkl), x(hi..l), x(h.j.l), x(h..kl), x(hijk.)
                                        2I(x*_r:x*_q) =  0.346       1
                                        2I(x:x*_r)    =  4.204       4

s)  x(.ijkl), x(h..kl), x(hijk.), x(hij.l)
                                        2I(x*_s:x*_r) =  2.303       1
                                        2I(x:x*_s)    =  1.901       3

t)  x(.ijkl), x(hijk.), x(hij.l), x(hi.kl)
                                        2I(x*_t:x*_s) =  1.375       1
                                        2I(x:x*_t)    =  0.526       2

u)  x(.ijkl), x(hijk.), x(hij.l), x(hi.kl), x(h.jkl)
                                        2I(x*_u:x*_t) =  0.361       1
                                        2I(x:x*_u)    =  0.165       1
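The additive structure of the analysis of information can be checked directly from rows a) through e) of table 2a: each 1 D.F. component equals the drop in 2I(x:x*) between successive fits. The sketch below also attaches an approximate tail probability to each component, assuming the usual asymptotic chi-square reference distribution; the helper name `chi2_sf_1df` is hypothetical and handles the 1 D.F. case only.

```python
import math

# Rows a) through e) of table 2a: 2I(x : x*) for each fit, and the 1 D.F.
# component gained over the preceding fit.
fit = {"a": 68.369, "b": 65.993, "c": 61.728, "d": 36.498, "e": 16.307}
component = {"b": 2.376, "c": 4.265, "d": 25.230, "e": 20.191}

def chi2_sf_1df(value):
    """Upper tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(value / 2.0))

order = "abcde"
# Drop in 2I(x : x*) from each fit to the next, to three decimals.
gaps = {cur: round(fit[prev] - fit[cur], 3) for prev, cur in zip(order, order[1:])}
p_values = {name: chi2_sf_1df(v) for name, v in component.items()}
```

Under that assumption the component for the first statement, 2.376, has a tail probability above 0.05, consistent with the statement in the text that this association is not significant.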

Table 2b

Analysis of Information

    Marginals Fitted
                                        Information                D.F.

e)  x(.ijkl), x(hi...), x(h.j..), x(h..k.), x(h...l)
                                        2I(x:x*_e)    = 16.307      11

v)  x(.ijkl), x(h.j..), x(h..k.), x(hi..l)
                                        2I(x*_v:x*_e) =  3.735       1
                                        2I(x:x*_v)    = 12.572      10

w)  x(.ijkl), x(h..k.), x(hi..l), x(h.j.l)
                                        2I(x*_w:x*_v) =  3.443       1
                                        2I(x:x*_w)    =  9.129       9
Log-odds ln [x*_e(1ijkl)/x*_e(2ijkl)]

ijkl    Parametric representation                                  log-odds

1111    τ^h_1 + τ^hi_11 + τ^hj_11 + τ^hk_11 + τ^hl_11                0.2128
1112    τ^h_1 + τ^hi_11 + τ^hj_11 + τ^hk_11                         -0.1210
1121    τ^h_1 + τ^hi_11 + τ^hj_11 + τ^hl_11                         -0.1284
1122    τ^h_1 + τ^hi_11 + τ^hj_11                                   -0.4621
1211    τ^h_1 + τ^hi_11 + τ^hk_11 + τ^hl_11                          0.0888
1212    τ^h_1 + τ^hi_11 + τ^hk_11                                   -0.2450
1221    τ^h_1 + τ^hi_11 + τ^hl_11                                   -0.2524
1222    τ^h_1 + τ^hi_11                                             -0.5861
2111    τ^h_1 + τ^hj_11 + τ^hk_11 + τ^hl_11                          0.4158
2112    τ^h_1 + τ^hj_11 + τ^hk_11                                    0.0820
2121    τ^h_1 + τ^hj_11 + τ^hl_11                                    0.0746
2122    τ^h_1 + τ^hj_11                                             -0.2592
2211    τ^h_1 + τ^hk_11 + τ^hl_11                                    0.2918
2212    τ^h_1 + τ^hk_11                                             -0.0420
2221    τ^h_1 + τ^hl_11                                             -0.0494
2222    τ^h_1                                                       -0.3831

τ^h_1 = -0.3831,  τ^hi_11 = -0.2030,  τ^hj_11 = 0.1240,
τ^hk_11 = 0.3411,  τ^hl_11 = 0.3338

Table 3a
Log-odds ln [x*_v(1ijkl)/x*_v(2ijkl)]

ijkl    Parametric representation                                  log-odds

1111    τ^h_1 + τ^hi_11 + τ^hj_11 + τ^hk_11 + τ^hl_11 + τ^hil_111    0.3571
1112    τ^h_1 + τ^hi_11 + τ^hj_11 + τ^hk_11                         -0.2898
1121    τ^h_1 + τ^hi_11 + τ^hj_11 + τ^hl_11 + τ^hil_111              0.0115
1122    τ^h_1 + τ^hi_11 + τ^hj_11                                   -0.6355
1211    τ^h_1 + τ^hi_11 + τ^hk_11 + τ^hl_11 + τ^hil_111              0.2366
1212    τ^h_1 + τ^hi_11 + τ^hk_11                                   -0.4101
1221    τ^h_1 + τ^hi_11 + τ^hl_11 + τ^hil_111                       -0.1088
1222    τ^h_1 + τ^hi_11                                             -0.7557
2111    τ^h_1 + τ^hj_11 + τ^hk_11 + τ^hl_11                          0.3847
2112    τ^h_1 + τ^hj_11 + τ^hk_11                                    0.1167
2121    τ^h_1 + τ^hj_11 + τ^hl_11                                    0.0390
2122    τ^h_1 + τ^hj_11                                             -0.2290
2211    τ^h_1 + τ^hk_11 + τ^hl_11                                    0.2644
2212    τ^h_1 + τ^hk_11                                             -0.0036
2221    τ^h_1 + τ^hl_11                                             -0.0813
2222    τ^h_1                                                       -0.3492

τ^h_1 = -0.3492,  τ^hi_11 = -0.4065,  τ^hj_11 = 0.1203,
τ^hk_11 = 0.3457,  τ^hl_11 = 0.2680,  τ^hil_111 = 0.3789

Table 3b
{ijkl: ln odds ≥ 0}

1111   1211   1221   2111   2112   2121   2211   2221

[Numerical entries of table 4, parts (a) to (d), are illegible in the source.]

Table 4
Martin and Bradley

E_1       x(1ijkl)    x(2ijkl)

1111        74.67       60.33
1211        12.02       10.98
2111       334.50      207.50
2112       193.45      178.55
2121       239.17      240.83
2211        37.74       28.26

Total      891.55      726.45

$$\mu_2(E_1) = \frac{726.45}{1491}, \qquad \mu_1(E_2) = \frac{1491-891.55}{1491}$$

$$\text{Prob. Error} = \frac{1}{2}\cdot\frac{726.45+599.45}{1491} = \frac{1325.90}{2982} = 0.445$$

Table 4(e)
Divergence Between Low IQ and High IQ
Observations and Estimates

$$\tfrac12\sum_i\sum_j\sum_k\sum_\ell\bigl(x(1ijk\ell)-x(2ijk\ell)\bigr)\ln\frac{x(1ijk\ell)}{x(2ijk\ell)} = 69.132$$

69.132/15 = 4.61/D.F.

$$\tfrac12\sum_i\sum_j\sum_k\sum_\ell\bigl(x^*_e(1ijk\ell)-x^*_e(2ijk\ell)\bigr)\ln\frac{x^*_e(1ijk\ell)}{x^*_e(2ijk\ell)} = 52.374$$

52.374/11 = 4.76/D.F.

$$\tfrac12\sum_i\sum_j\sum_k\sum_\ell\bigl(x^*_v(1ijk\ell)-x^*_v(2ijk\ell)\bigr)\ln\frac{x^*_v(1ijk\ell)}{x^*_v(2ijk\ell)} = 56.249$$

56.249/10 = 5.62/D.F.

$$\tfrac12\sum_i\sum_j\sum_k\sum_\ell\bigl(x^*_w(1ijk\ell)-x^*_w(2ijk\ell)\bigr)\ln\frac{x^*_w(1ijk\ell)}{x^*_w(2ijk\ell)} = 59.815$$

59.815/9 = 6.65/D.F.

Table 5